Abstract



We analyze and reinterpret experimental evidence from the literature to argue for an ability of tumor 
cells to self-regulate their aneuploidy rate. We conjecture that this ability is mediated by a diversification 
factor that exploits molecular mechanisms common to embryo stem cells and, to a lesser extent, adult stem 
cells, that is eventually reactivated in tumor cells. Moreover, we propose a direct application of the
quasispecies model to cancer cells based on their significant genomic instability (i.e. aneuploidy rate),
by defining master sequence lengths as the sum of all copy numbers of physically distinct whole and fragmented
chromosomes. We compute an approximate error threshold such that any aneuploidy rate larger than 
the threshold would lead to a loss of fitness of a tumor population, and we confirm that highly aneuploid 
cancer populations already function with aneuploidy rates close to the estimated threshold. 
Keywords: Quasispecies, Cancer, Chromosomal instability, Aneuploidy.

1 Introduction

Non-diploid chromosome content, also known as aneuploidy, is the most common feature of human tumor
cells [1-3]. However, there has been dispute on whether aneuploidy and chromosomal instability, i.e. the
tendency to gain or lose parts of the genome during cell replication, give an advantage or disadvantage
to the tumor cells, with recent evidence strongly favoring the former [4].

Less attention has been given to potentially beneficial roles of aneuploidy in developmental biology, as
it is generally assumed to be a byproduct of aberrant cell division, with mostly lethal or negative impact.
And yet, high levels of aneuploidy are associated with increased adaptability in plants and yeast 0(5], and
a certain rate of aneuploidy, leading to precise percentages of mosaic aneuploidy, is common in the embryos
of several mammals ([7], chapter 10), including humans 0|9], and is much more common than the corresponding
miscarriage rate would imply, at least if we extrapolate evidence from pig embryos [10]. Similarly, it
is speculated that the significant mosaic aneuploidy in adult human organs such as liver and brain is
instrumental to an increased plasticity and adaptability of such organs [11, 12], with [13] raising the
possibility that the extensive aneuploidy in the embryo may transfer into similarly widespread copy
number variations in all human tissues.

The specificity of aneuploidy in its manifestations is striking: a defined percentage of aneuploid cells, 
and a confinement of higher aneuploidy to specific organs rather than uniform distribution across the 
body. These facts are suggestive of a fine tuned, even post-embryonic, use of aneuploidy rather than a 
simple byproduct of aberrant or sustained cell division. 

Since many types of cancers partially inherit the hierarchical structure of the tissues they have de- 
rived from and are assumed to be propagated by stem-like tumor cells [14], it is possible that increased 
aneuploidy rates are used actively, to the population advantage, to increase the adaptability of stem or 
fast-dividing progenitor cells. 

In this scenario a high chromosomal instability rate or gene copy-number variation, both resulting in
mosaicism, would be the means to achieve enough genetic diversification. We will refer to both as
aneuploidy rate in this paper, to emphasize their close link, and we formally define aneuploidy
rate as the average probability that there is at least one new aneuploid modification per chromosome during
cell replication.

Observable levels of aneuploidy hold in adult cells [15], even though much lower than in embryos.
While this widespread aneuploidy could already originate during embryo development [13], adult,
non-transformed stem cells continue to have distinct levels of aneuploidy rates according to their type. For
example, mesenchymal stem cells are likely to have very low aneuploidy rates [16], while hepatocytes
together with small intestine and pancreas cells display extensive within-tissue copy number variation
(CNV) [17, 18].

These strikingly different aneuploidy rates among embryo stem cells and adult stem cells raise the
possibility that the finely tuned high aneuploidy rate observed in embryos is adaptively regulated by some
mechanism specific to stem cells, a diversification factor, that is accidentally reactivated in cancer cells.

In Section 2 we broadly review existing literature on aneuploidy, collecting hints of the potentially 
positive impact of aneuploidy for complex organisms and cancer cells. We then suggest specific evidence 
for the existence of a diversification factor by reinterpreting, in Section 3, recent single cell analysis 
experimental work. 

The quasispecies model of evolution was introduced by Eigen in 1971 and has been applied in many
different fields, on account of its usefulness as a general evolutionary model for error-prone self-replicative
systems [19]. Assuming aneuploidy rate is regulated in normal and cancer (stem) cells, quasispecies
theory can be adapted to predict a maximal aneuploidy rate, an error threshold, beyond which each cancer
subpopulation loses its identity, and therefore its ability to carry its selectively advantageous
genetic traits to future cell generations, a phenomenon referred to as error catastrophe [20].

In Section 4 we estimate an error threshold for aneuploidy rates in cancer cells. Unlike previous
attempts to adapt quasispecies theory to cancer [21-27], we do not recommend a specific alternative
phenomenological model of cancer cell subpopulation dynamics. Instead, we completely refocus the
quasispecies model on aneuploidy, by defining the notion of chromosomal master sequences, whose length
is taken to be the sum of the copy numbers of each whole or fragmented chromosome, and by using
aneuploidy rates in the calculation of the probability of precise reproduction of sequences.
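As a minimal sketch of this definition (an illustration, not the authors' code), the length of a chromosomal master sequence can be obtained by summing copy numbers over a karyotype. The karyotypes below, the function names, and the assumption that every chromosome copy is reproduced independently with the same aneuploidy rate are all hypothetical choices made only for this example.

```python
def master_sequence_length(copy_numbers):
    """Sum of the copy numbers of all physically distinct whole or
    fragmented chromosomes in a karyotype."""
    return sum(copy_numbers.values())


def precise_reproduction_probability(aneuploidy_rate, length):
    """Probability that no chromosome copy acquires a new aneuploid
    modification during one replication.

    Assumption (ours): copies are modified independently, each with the
    same per-chromosome aneuploidy rate.
    """
    return (1.0 - aneuploidy_rate) ** length


# Hypothetical normal female karyotype: 22 autosome pairs plus two X copies.
diploid = {f"chr{i}": 2 for i in range(1, 23)}
diploid["chrX"] = 2

# Hypothetical aneuploid karyotype: one gain, one loss, one extra fragment.
aneuploid = dict(diploid)
aneuploid["chr8"] = 3
aneuploid["chr17"] = 1
aneuploid["chr1q_fragment"] = 1

if __name__ == "__main__":
    for name, karyotype in (("diploid", diploid), ("aneuploid", aneuploid)):
        length = master_sequence_length(karyotype)
        prob = precise_reproduction_probability(0.01, length)
        print(name, length, round(prob, 4))
```

Fragments are counted as separate entries, so a karyotype with many broken chromosomes has a longer master sequence than its total DNA content alone would suggest.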

2 Aneuploidy in Normal and Cancer Stem Cells

In normal cells the number of chromosomes and the total DNA content depend on the phase of the
cell cycle [28]. On the contrary, cancer cells usually display aneuploidy, and their chromosome load is
generally higher than that of normal cells. This feature of tumor cells is commonly associated with acquired
resistance to various kinds of treatments such as radio- or chemotherapy [1-3, 29]. Nonetheless, it is not
clear whether aneuploidy contributes to and even drives tumor development or is deleterious, since
individuals carrying an extra copy of chromosome 21 have a 50% lower probability of developing solid
tumors than do individuals with the correct chromosome number [30, 31]. However, alterations in the
karyotype, i.e. the canonical number and structure of chromosomes, represent the major cause of mental
retardation and miscarriages [32, 33]. The presence of constitutional aneuploidy, with a proportion of
aneuploid cells >25% in most tissues, is characteristic of Mosaic Variegated Aneuploidy syndrome (MVA),
a rare autosomal disorder [33]. Notably, MVA patients are affected by growth retardation, microcephaly
and, among other developmental abnormalities, cancer predisposition [35].

Although aneuploidy is compatible with organism and cell viability, the presence of additional copies 
of chromosomes decreases the overall cellular fitness [36]. However, gain or loss of specific chromosomes 
results in a small subset of cells with increased rate of transformation [36, 37], with recent evidence
showing that aneuploidy is likely to promote cancer development [35]|3S]. In this scenario, aneuploidy 
could be a force driving cellular selection through a Darwinian process in which only cells with aneuploidy- 
driven adaptive traits, i.e. favourable changes in the dosage of specific sets of genes, overcome the 
microenvironmental challenges and survive [40, 41].

Since aneuploidy is generally associated with cancer, its potentially advantageous role in a
physiological context has not been taken into consideration. In developmental biology, aneuploidy is regarded
as an inescapable consequence of the rapid alternation of S and M phases during embryo cleavage [28]
and has been associated with miscarriages and congenital defects. Counterintuitively, the percentage of
cells undergoing DNA replication in solid tumors, which are mostly aneuploid, varies between 2 and 8%,
whereas a normal renewing epithelium such as the intestine exhibits a DNA replication index of
approximately 16% [42]. Moreover, the majority of trisomies and chromosomal aberrations which are associated
to miscarriages and birth defects, are known to originate during oogenesis, not during the cleavage stage, 
and are frequently due to non-disjunction or anaphase lagging in maternal meiosis I, that takes place 
mostly in the fetal life of the mother [43]. And yet, aneuploidies do exist that can be compatible with 
viable pregnancies and include those associated with chromosomes 13, 18, 21, X and Y [H]. Defined 
threshold levels of aneuploidy that are compatible with life and yield mosaicism, are common in several 
mammals' embryos ( [7] chapter 10), including humans [HUH]. 

If our hypothesis of the existence of a diversification factor in stem cells is valid, a constitutional 
aneuploidy in the embryo would translate into widespread copy number variations in all human tissues 
[13]. Indeed, aneuploidy is well documented in the healthy adult human liver and brain, but is also
detectable in skin, sperm and ovarian tissues from presumably normal individuals [12, 18, 45-47]. Two
novel genome-wide association studies confirm the presence of unexpectedly high-frequency detectable 
levels of mosaicism in control human blood samples [ISISS] and even large CNVs have been recently 
detected 'within' tissues of the same individuals Q2] . 

Genetic differences in monozygotic (MZ) twins (those who stem from the same zygote) represent
an attractive model for studying somatic variations that occur during early embryonic development.
Monozygotic twins frequently carry different copy number profiles [50] and epigenetic marks [51]. Using
single nucleotide polymorphism arrays and fluorescent in situ hybridization analysis on a pair of
monozygotic twins, it has been found that one twin had monosomy X (45,X) in 7% of proband nucleated blood
cells, whereas the other twin had 45,X and 46,XY lineages, both present in 1% of her cells [52]. In a
separate study, a high incidence of segmental uniparental disomies, complete trisomies and several large
copy number variants in multiple subjects was demonstrated. In one individual, five out of six alterations
tested were detected in both blood and bladder tissue [53], indicating an early developmental origin.
Taken together, these findings suggest that the resulting mosaicism in the adult partially originates in the
embryonic stage, but a significant fraction also derives from de novo somatic modifications.

Occasionally the genome can be surprisingly tolerant to accommodate large copy number changes 
in apparently healthy subjects, raising the question whether this phenomenon only underlies a cell- 
replication error burden or it serves a physiological purpose and is instrumental to an increased plasticity 
and adaptability, which is fundamental to survive the continuous stream of environmental stresses. 

Indeed, a clear association exists between maternal age and miscarriage rates [43] or between somatic
mosaicism and ageing or cancer incidence [48, 49]. The positive correlation between age and elevated
CNV profiles or clonal mosaicism holds true in MZ twins as well [50, 51, 54]. The canonical interpretation
of these scientific data is that, during their lifespan, cells and tissues undergo a series of mutational events
and accumulate genetic abnormalities, ultimately leading to ageing and cancer. If this interpretation is
correct, the presence of high CNV in most somatic tissues should translate into a default high probability
of cancer development, especially so for highly proliferating tissues, but obviously this is not the case. On
the contrary, if analyzed in the context of adaptability, the presence of an increasing somatic mosaicism
and within-tissue CNV during lifespan, rather than a side-effect, is suggestive of a genetic diversification
induced by environmental challenges during an individual's life [55]. In fact, while germline genetic
alterations are generally viewed as negative, a widespread somatic variation could be beneficial [56].
For example, in tissues that frequently encounter pathogens, CNVs that eliminate viral receptors might
enhance host survival. Specific gains and losses of chromosomes harboring injury-resistance alleles in
normal, nontransformed hepatocytes may render them differentially resistant to chronic insults such as
viral hepatitis as well as alcohol- and fat-induced hepatitis [11]. Interestingly, mouse adult hepatocytes
can increase and reduce their ploidy upon injury, thus resulting in a tremendous genetic heterogeneity
and eventually leading to favorable cellular selection [17].

Studies in yeast showed that aneuploidy can provide a strong selective advantage in response to
multiple environmental stressors [57, 58]. Since chromosomes of higher eukaryotic genomes contain up
to 12% of genes arranged in functional neighborhoods, with a high level of gene co-expression [59], and
given that 5-10% of all genes are thought to be monoallelically expressed [60], even small changes in the
total DNA content of a cell, i.e. low-level aneuploidy, are likely to cause phenotypic consequences [55].
As an example, the developing human brain has been proven to be a mosaic, with 30% of the cells being
aneuploid and up to 1.45% frequency of aneuploidy per chromosome [12, 61]. The resulting 10% of healthy
adult brain cells bearing abnormal chromosomal content could derive from the adaptive selection of three
times the amount of aneuploid neuronal cells in the fetus and might explain the significant fraction
of IQ-discordant monozygotic twins [62]. Similarly, most of the de novo CNVs in a set of provisional
schizophrenia genes analyzed in MZ twins have been shown to arise during developmental mitosis and
are likely to account for the discordance in MZ twins for a variety of diseases including schizophrenia [63].

Embryonic stem cells are believed to be the primary source of somatic mosaicism [13]. Indeed, although
embryonic stem cells (ESCs) or induced pluripotent stem cells (iPSCs) accumulate specific chromosomal
changes when cultured in vitro [64], these latter changes alone cannot account for the genetic heterogeneity
displayed by these cells [55]. It has been recently published that as much as 30% of the original normal
fibroblast population from which iPSCs were derived presents somatic CNVs [15], pointing to an
ESC-specific genetic diversification program which is activated during development. Despite this strong,
but indirect, evidence for a widespread mosaicism in the human stromal population of skin, human
multipotent skin-derived precursor cells are known to keep a stable karyotype in culture [66]. Likewise,
hematopoietic and gastric stem-like cells [67, 68] and adipose-derived stromal cells [16] are known to
maintain genetic stability in long-term culture. Recently, human cardiac stem cells have been isolated
and they displayed long-term karyotypic stability in culture [69]. In another study, in vitro human-derived
adult mesenchymal stromal cells have been shown to acquire detectable aneuploidy that is not related to
culture and might be donor-dependent [70].

A recent comprehensive analysis of chromosomal abnormalities in cultured human adult stem cells
revealed that the genetic identity of mesenchymal, neural and hematopoietic stem cells changes during
cell culture passages and that specific aberrations confer growth advantage in a cell lineage-specific manner
[71]. Conversely, previous studies confirmed the diploid nature of neural stem cells over 100 passages in
culture [72].

Even though there is conflicting literature in this field, many cultured adult stem-like cells are reported
to display genetic stability, and the opposing results obtained by different groups might depend on the
experimental settings, on the individuals from which adult stem cells were derived, as well as on the subset
of adult stem cells. Taken together, these observations point to a basal genetic stability of adult stem
cells that is not compatible with the presence of extensive basal levels of CNV in adult dividing tissues
such as liver, small intestine and pancreas [18]. Remarkably, almost 80% of the CNVs detected in [18]
are found in gene sequences, pointing to a non-random distribution of the genetic modifications. If we
assume that only ESCs have the potential to generate somatic mosaicism and that all somatic changes
are age-related, then a consistent fraction of adult stem cells in renewing human organs should display
the genetic marks of the transmitted CNV. Since most adult stem cells display normal karyotype, we
envision a potential contribution of adult stem cells in physiologically and/or adaptively generating the
somatic genetic heterogeneity that distinguishes adult tissues. In this regard, although ESCs and adult
stem cells display strikingly different aneuploidy rates, they might both retain the ability to activate
a molecular mechanism of genetic diversification in the presence of environmental stress. Therefore a
chronic insult would result in abnormally sustained diversification, tremendously raising the probability
that one or more aneuploid clones with tumorigenic properties would be positively selected.

As a matter of fact, it has been demonstrated that only a minor population of cells derived from a
single cancer cell clone (diversified population from a single cell) could be responsible for drug resistance
and that, upon removal of the drug, the population spontaneously reverted to a sensitive state [73, 74]. This
evidence supports the notion that tumor cells are endowed with a diversification potential. Indeed,
clonal heterogeneity within tumors is the main cause of tumor dormancy and resistance to anti-cancer
therapies [29, 75].

Many types of cancers partially inherit the hierarchical structure of the tissues they have derived
from and are assumed to be propagated by stem-like tumor cells [14]. Therefore it is conceivable that
cancer stem cells (CSCs) or any tumor-initiating cell employ increased aneuploidy rates actively, to the
population advantage, to increase the overall fitness and adaptability of the tumor.

It is worth noting that, to date and to our knowledge, only two studies assessed the relationship
between aneuploidy and CSCs [76, 77]. In the first paper, Kusumbe and Bapat evaluated the expression
of stem-cell markers and the DNA content distribution of fluorescently labeled ovarian cancer cells after
subcutaneous injection into immunodeficient mice. The authors found that, unlike label-free tumor cells,
the label-retaining (quiescent) cells displayed stem-cell markers and were embedded with a small fraction
of aneuploid cells. Treatment with chemotherapy increased the percentage of quiescent cells in the overall
population and selectively stimulated the proliferation of the aneuploid fraction, which retained stemness
properties upon removal of the drug [76]. A second study, by Fujimori et al., again reveals that stressful
conditions favor the emergence of CSC-like clones from differentiating embryonic stem cells in in vitro
culture [77].

We hypothesize that genetic heterogeneity is obtained in ESCs, adult stem cells and CSCs through
increased aneuploidy rates and, by revisiting a recent work in Section 3, we suggest some evidence of
the existence of a diversification factor. In Section 4, in the context of a controlled diversification due to
variable aneuploidy rates, we show how to interpret quasispecies theory to predict a maximal aneuploidy
rate, an error threshold, beyond which a dominant cancer subpopulation loses its identity and therefore
its fitness, a phenomenon referred to as error catastrophe [20].

3 Evidence for Adaptive Aneuploidy Rate

As already reviewed in Section 2, the development of single-cell sequencing techniques [75] has recently
allowed a wide range of studies analyzing variability in primary and metastatic tumor cells, as well as
healthy tissues [47, 79]. In this section we revisit and comment on some specific experimental
evidence in light of our proposal, focusing on several measures of aneuploidy rate, and showing how
the assumption of a diversification factor can give new meaning to the variability observed in tumor
sub-populations subject to distinct micro-environments.

The analysis in this section is phenomenological, meaning that we explain and reinterpret existing
experimental work, showing how the inner logic of our argumentation can severely constrain the causes
and interpretation of the data we review. We chose to perform an in-depth analysis of a single recent
study [80], so that the flow of our discourse is unified and made coherent by constant reference to the
same context. At the same time, we support our arguments with related experimental works, when
appropriate.

The main focus and objective of [80] was to show that metastatic tumors are likely to be the product
of single clone proliferation from the primary tumor, by observing the microvariation of the integer copy
number of consensus sequences in individual tumor cells through single-cell sequencing. A coarse ploidy
distribution of a large number of cells from a breast tumor and one of its metastases was plotted in ([80],
figure 3a-b) as a histogram with respect to the total DNA content. These ploidy distributions showed,
for both primary and metastatic tumors, two peaks, one around twice and another at four times the total
amount of DNA, the double-peaked distribution being accounted for by the presence, in each tumor tissue
sample, of roughly 50% of normal diploid cells. Importantly, whereas in the primary tumor a significant fraction
of the gated normal cells was pseudodiploid, in the paired metastasis the normal population was likely
to derive from the stromal content of the tissue ([80], figure 4).

The authors of [80] performed very refined measurements of copy number profiles, across all
chromosomes, from a small subset of cells (hundreds) in sections of primary and metastatic tumors and generated
a neighbor-joining tree of these profiles. This analysis showed that metastatic and primary aneuploid
cells were closely related in the neighbor-joining tree derived from the clustering, and yet they produced
clearly distinct sub-clusters. The conclusion of this single cell study was that the metastasis proliferated
from a single cell derived from the aneuploid subpopulation of the primary tumor, since no pseudodiploid
cancer cells were observed in the metastatic tumor.

Other recent studies performed in myeloproliferative disorders [81] and melanoma [82], kidney [83, 84]
and pancreas [85] tumors or, again, in breast cancer [86], arrived at similar conclusions, pointing to a
late, metastasis-specific diversification of primary tumor-derived cells ([87], figure S14), [88]. In [84],
the presence of aneuploid cells in most primary tumors examined is well documented by ploidy analyses
in supplementary figure 10, and metastases show a marked increase in allelic imbalance as compared to
primary tumor regions. The authors of [84] conclude that tumor heterogeneity is probably driven by
aneuploidy and that chromosomal aberrations contribute substantially to genetic intratumor
heterogeneity. Notably, as further evidence of the robustness of our reasoning, even in an evolutionary context
where the primary tumor and the metastasis share most of the sequenced regions, there is a striking
variation in copy number specifically in the metastatic counterpart ([88], figures S5 to S11).

This concept is exemplified in [80], but here both the coarse ploidy distribution analysis and the
refined, single cell copy number count for primary and metastatic tumors drive us to a more powerful
conclusion: the genetic variability of the aneuploid clone in the metastasis is greater than its corresponding
variability in the primary tumor from which it came. Even if we took into account a parallel progression
model as opposed to a punctuated or linear evolution [89, 90], the final result would not change. Regardless
of whether the metastasis developed in a later, much shorter time than the primary tumor or whether
its origins date back to the first stages of primary tumor dissemination, the metastatic population in
the paper from Navin et al. has diversified more than its parental population in the primary tumor,
even though the aneuploid cell compartment in the primary tumor analyzed by Navin et al. does not
represent a minority of the population, but has itself expanded considerably at some point during the
tumor evolution history.

To justify our claim, we note that the Euclidean distances in the neighbor-joining tree for the aneuploid
cells from the metastatic tumor studied in [80] showed much greater variability than the Euclidean
distances for the corresponding aneuploid cells in the primary tumor.¹ Granted, the cited study
dealt with very small sample populations, but a closer inspection and analysis of the tightness of the
variance of Euclidean distances in the primary tumor subpopulation, as opposed to the variance of
Euclidean distances of the metastatic tumor subpopulation, would almost certainly reveal a statistically
significant discrimination of the two.
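One way such a comparison of variances could be carried out is sketched below. This is only an illustration under our own assumptions: the distance values are invented placeholders, not data from [80], and a permutation test on mean-centred distances is one of several possible choices for small samples.

```python
import random
from statistics import mean, pvariance


def variance_permutation_test(primary, metastatic, n_permutations=10000, seed=0):
    """One-sided permutation test of whether `metastatic` distances are
    more dispersed than `primary` distances.

    Each group is centred on its own mean first, so that only the spread,
    and not a shift in location, drives the test statistic.
    Returns (observed variance difference, estimated p-value).
    """
    rng = random.Random(seed)
    prim = [d - mean(primary) for d in primary]
    meta = [d - mean(metastatic) for d in metastatic]
    observed = pvariance(meta) - pvariance(prim)

    pooled = prim + meta
    n_prim = len(prim)
    at_least_as_extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)
        diff = pvariance(pooled[n_prim:]) - pvariance(pooled[:n_prim])
        if diff >= observed:
            at_least_as_extreme += 1
    # Add-one correction keeps the estimated p-value strictly positive.
    return observed, (at_least_as_extreme + 1) / (n_permutations + 1)


# Hypothetical Euclidean distances from a common root profile.
primary_distances = [10.0, 10.5, 9.8, 10.2, 10.1, 9.9, 10.3, 10.0]
metastatic_distances = [8.0, 12.5, 10.0, 14.0, 6.5, 11.0, 9.0, 13.0]

if __name__ == "__main__":
    diff, p_value = variance_permutation_test(primary_distances, metastatic_distances)
    print(f"variance difference = {diff:.3f}, p = {p_value:.4f}")
```

With real single-cell profiles the two lists would hold the root-to-leaf distances of the primary and metastatic aneuploid cells; the test makes no normality assumption, which matters for samples this small.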

The hypothesis of a larger variability of the copy number profiles of the metastatic subpopulation is
clearly supported by a closer inspection of the tails of the ploidy distributions of primary and metastatic
tumor populations analyzed in [80]. The right-hand sides of the tetraploid peaks for the metastatic tumor
have distinctly thicker and longer tails than the corresponding tetraploid peaks in the primary tumor,
suggesting greater variability of aneuploidy in the former. Other works point to a similar conclusion:
for example, it has been shown recently that, although sharing most of the examined somatic
single-nucleotide variants, in vitro cultured low-passage melanoma cells have higher copy number variation
when compared to the parental tumor [91].

Ploidy distributions, in their simplicity, offer even more scope for interpretation and testing of the
hypothesis that cancer cells have the ability to self-regulate their aneuploidy rate. Indeed, a single
cell clone could be capable of generating a diverse metastasis either because of inherent chromosomal
instability, or because its rate of aneuploidy is somehow increased under the stress conditions of a new
tissue embedding.

¹ These distances were calculated with respect to a common root profile; mutual distances among individual
profiles are likely to be even greater.

Let's assume first that the single clone from the primary tumor has chromosomal instability. The
dispersion of metastatic cells should not be in any way preferential to such cells (even if their successful
embedding in a tissue may be), so there will be a sub-population of cells, possibly small, in the primary
tumor with similar or higher aneuploidy rate than the cell generating the metastasis. This subpopulation
of the primary tumor, by its greater aneuploidy rate, will be more adaptable and likely self-sustaining,
and it should be observable as a long tail in the ploidy distribution of the primary tumor population. The
tail would be much thinner for the primary tumor than for the metastatic tumor, since high aneuploidy rate
cells are only a sub-population of the primary tumor. However, no such long and thin tail is observed
experimentally, for the primary tumor, in [80]. It is still possible, if highly unlikely, that the metastatic
cell is an extreme outlier, with no comparable cells left in the primary tumor, but then we would see a
much more pronounced evolutionary difference between primary and metastatic aneuploid populations
than what is observed. The only logical conclusion is that the single metastatic cell clone was not
essentially different from its primary population before starting to proliferate in the new environment.

This argument leaves only one other option: that some cancer cells are capable of altering and adapting 
their aneuploidy rate under stress, or under changes in the environment. 



4 Aneuploid Error Threshold

The concept of quasispecies was first introduced in [92, 93], and it is a powerful way to relate the structure
of population dynamics to the error rate of single base replication in viruses or unicellular organisms
[20, 94]. The most important consequence of the theory is that it is possible to determine theoretically
a threshold on the error rate such that, if the error rate of replication of genomic sequences is pushed 
above the threshold, the subpopulation will not be able to retain its identity, under mild conditions on 
subpopulations interactions and fitness distribution [95]. Here we sketch the argument that leads to the 
error threshold inequality and we refer to [^nUMlISS] for details. 

Assuming there are $N$ subpopulation types within a population, we start by writing down the
differential equation that describes the rate of change $dx_m/dt$ of type $m$ in terms of the instantaneous
sizes $x_j(t)$, $j = 1, \dots, N$, of all types:

\[ \frac{dx_m}{dt} = \left(W_{mm} - E(t)\right) x_m(t) + \sum_{k \neq m} W_{mk}\, x_k(t), \tag{1} \]

where $W_{mm}$ is the rate of effective excess production of subpopulation type $m$. If we consider the genetic
sequence associated with type $m$, we can write $W_{mm} = Q_{mm} A_m - D_m$, with $Q_{mm}$ the probability of precise
reproduction of sequence $m$, $A_m$ the growth rate of type $m$, and $D_m$ its mortality rate. $E(t) = \sum_k E_k x_k$
is the average, over all types, of the excess reproduction rate, with $E_k = A_k - D_k$ and, finally, $W_{mk}$ is the
rate of production of type $m$ by erroneous reproduction of type $k$.
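The dynamics of Eq. 1 can be explored numerically with a few lines of code. The sketch below is an illustration under stated assumptions, not part of the original analysis: it uses forward Euler integration, treats the $x_m$ as relative frequencies summing to one, and assumes the common decomposition $W_{mk} = A_k Q_{mk}$ for $k \neq m$, where $Q_{mk}$ is the probability that replication of type $k$ yields type $m$. The two-type parameter values are hypothetical.

```python
def simulate_quasispecies(A, D, Q, x0, dt=0.01, steps=5000):
    """Forward-Euler integration of Eq. 1.

    A[k], D[k] : growth and mortality rates of type k.
    Q[m][k]    : probability that replicating type k produces type m
                 (each column sums to 1), so W_mk = A[k] * Q[m][k] for k != m
                 and W_mm = Q[m][m] * A[m] - D[m].
    x0         : initial relative frequencies (summing to 1).
    """
    n = len(A)
    x = list(x0)
    for _ in range(steps):
        # Average excess reproduction rate E(t) = sum_k (A_k - D_k) x_k.
        E = sum((A[k] - D[k]) * x[k] for k in range(n))
        dx = []
        for m in range(n):
            w_mm = Q[m][m] * A[m] - D[m]
            inflow = sum(A[k] * Q[m][k] * x[k] for k in range(n) if k != m)
            dx.append((w_mm - E) * x[m] + inflow)
        x = [x[m] + dt * dx[m] for m in range(n)]
    return x


def two_type_Q(fidelity):
    """Master (index 0) copies itself with the given fidelity and otherwise
    produces the mutant (index 1); the mutant never reverts."""
    return [[fidelity, 0.0], [1.0 - fidelity, 1.0]]


# Hypothetical rates: the master replicates twice as fast as the mutant.
A = [2.0, 1.0]
D = [0.0, 0.0]

if __name__ == "__main__":
    for fidelity in (0.9, 0.6, 0.4):
        x = simulate_quasispecies(A, D, two_type_Q(fidelity), [0.5, 0.5])
        print(fidelity, [round(v, 4) for v in x])
```

In this toy setting the master type persists only while its replication fidelity exceeds one half, the reciprocal of its twofold growth advantage; below that value its frequency decays towards zero.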

Assuming a steady state in which $dx_m/dt = 0$ and neglecting, in first approximation, the contributions
$\sum_{k \neq m} W_{mk} x_k(t)$, it is possible to derive a condition that constrains the probability of precise
reproduction of sequence $m$:

\[ \sigma_m Q_{mm} > 1, \tag{2} \]

where $\sigma_m = A_m / (D_m + \bar{E}_{k \neq m})$, with $\bar{E}_{k \neq m} = \sum_{k \neq m} E_k x_k$, is the average superiority of a master
sequence associated to a dominant subpopulation versus competitor sequences; essentially, $\sigma_m$ is an index
of relative fitness [20]. If a master sequence has length $\nu_m$, and we denote by $q$ the average fidelity of single
nucleotide reproduction, then $Q_{mm} = q^{\nu_m}$. Taking logarithms in Eq. 2 and using $\ln q \approx -(1 - q)$ for $q$
close to 1, the error threshold can be written as

\[ 1 - q < \frac{\ln \sigma_m}{\nu_m}. \tag{3} \]

Remarkably, Eq. 3 establishes a phase transition on the information content: if the error rate of single
nucleotide reproduction goes above $\ln \sigma_m / \nu_m$, the information contained in the master sequence will
disintegrate, in the sense that the loss of information in the sequence due to reproduction errors will not
be compensated by a sufficiently high fitness relative to other subpopulations, and the subpopulation
associated to the master sequence will implode [20, 95].
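To give a feel for the magnitudes involved, the threshold can be evaluated numerically. The sketch below is illustrative only: the superiority value and the sequence length are hypothetical inputs, and the second function solves $\sigma_m q^{\nu_m} = 1$ exactly so that the effect of the linearization $\ln q \approx -(1 - q)$ can be checked.

```python
import math


def error_threshold(sigma, length):
    """Approximate maximal error rate per sequence unit, ln(sigma) / length."""
    if sigma <= 1.0:
        raise ValueError("the master sequence needs superiority sigma > 1")
    if length <= 0:
        raise ValueError("length must be positive")
    return math.log(sigma) / length


def exact_error_threshold(sigma, length):
    """Error rate 1 - q at which sigma * q**length equals 1 exactly."""
    if sigma <= 1.0:
        raise ValueError("the master sequence needs superiority sigma > 1")
    if length <= 0:
        raise ValueError("length must be positive")
    return 1.0 - sigma ** (-1.0 / length)


if __name__ == "__main__":
    # Hypothetical values: twofold superiority, sequence of 46 units
    # (for instance, one unit per chromosome copy of a diploid karyotype).
    sigma, length = 2.0, 46
    print("approximate threshold:", error_threshold(sigma, length))
    print("exact threshold:      ", exact_error_threshold(sigma, length))
```

Because the threshold scales as the inverse of the sequence length, a master sequence measured in tens of chromosome copies tolerates a far higher per-unit error rate than one measured in billions of nucleotides.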

In complex organisms, the quasispecies model is potentially applicable only in specific scenarios,
such as competition among embryo stem cells during development, adult stem cell and progenitor cell
proliferation, and, crucially, cancer cells, where subpopulations compete with each other under limited
resources and a changing environment. However, the applicability of the basic quasispecies model, originally
devised in the setting of virus RNA replication, has been questioned for eucaryotes
and specifically for cancer cells. Eucaryotic cells reproduce semi-conservatively, meaning that the parental
double strand degenerates in the process of generating two daughter double strands. This led the authors
of [21] to raise the possibility that, for high enough replication error rates, the master sequence, seen here
as the double strand of DNA, would eventually disappear, and to suggest more refined quasispecies
models that take this phenomenon into consideration.

Even more seriously, the applicability of quasispecies theory to human cells is put into question by
the exceedingly large size of the human genome as compared to RNA viruses. In fact, in order for the
quasispecies not to undergo genetic drift, the neutral space around a fitness peak should be sufficiently
small to be completely explored by the population. The complexity and the inherent mutational and
phenotypical robustness of the human genome amplify its neutral space, preventing quasispecies evolution
even at higher than normal mutation rates, as in cancer cells [97]. This fact, together with the very low single
nucleotide error rates in humans, implies that the fitness of mutants of a hypothetical master sequence does
not change significantly, and the fitness distribution of mutants around the master sequence is likely to
decay linearly, or sub-linearly, a scenario under which no error threshold is possible [95].

Indeed, the existence of Lynch syndrome, or hereditary non-polyposis colorectal carcinoma (HNPCC),
which is characterized by a higher risk of colon cancer, shows the inability of the basic
quasispecies theory to predict the maximum single nucleotide error rate that is viable for a tumor. HNPCC
tumors, as well as all microsatellite-instability (MSI) colon cancers, arise because of a breakdown of the
mismatch repair mechanism [98, 99]. As a result, MSI cancer cells display single nucleotide replication
error rates increased by 1 to 3 orders of magnitude with respect to the baseline probability $1 - q$ of
a single nucleotide error in healthy cells, estimated for the human genome to range between $10^{-9}$ and
$10^{-10}$ ([100] page 271, [101-103]).

Now recall that the human genome has roughly $3.2 \times 10^9$ nucleotides ([100] page 206), and note that
for organisms with very large genomes, the relative superiority $\sigma_m$ of a master sequence associated to a
given subpopulation cannot be very large (vis-à-vis other subpopulations), as any given mutation will
only marginally affect its fitness [20], and therefore $\sigma_m \approx 1$. Given these numerical estimates,
MSI tumors would fail to satisfy the error threshold inequality in Eq. 3
to such an extent that they should not even exist. This is true even if we restrict our attention, in defining
the master sequence, to conserved DNA, i.e. the 5% of the human genome that is known to be coding
and essential to cell function (see again [100], page 206).
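
The size of this mismatch can be illustrated with a short calculation. The sketch below rearranges Eq. 3 as $\ln \sigma_m > \nu_m (1 - q)$ and evaluates the right-hand side for the figures quoted above; the assumed superiority of 1.1 is a hypothetical value standing in for a relative superiority only slightly above 1, and the function and constant names are ours.

```python
import math

GENOME_LENGTH = 3.2e9            # nucleotides, as quoted in the text
CONSERVED_FRACTION = 0.05        # coding and essential fraction quoted in the text
BASELINE_ERROR = (1e-10, 1e-9)   # per-nucleotide error range in healthy cells
MSI_FOLD = (1e1, 1e3)            # increase of 1 to 3 orders of magnitude in MSI cells
SIGMA_ASSUMED = 1.1              # hypothetical superiority slightly above 1


def required_ln_sigma(error_rate, length):
    """Eq. 3 rearranged: viability requires ln(sigma) > length * (1 - q)."""
    return error_rate * length


def msi_cases():
    """Return (label, required ln sigma) for the extreme MSI error rates and both lengths."""
    low = BASELINE_ERROR[0] * MSI_FOLD[0]
    high = BASELINE_ERROR[1] * MSI_FOLD[1]
    lengths = {"whole genome": GENOME_LENGTH,
               "conserved 5%": GENOME_LENGTH * CONSERVED_FRACTION}
    return [(f"{name}, 1-q={rate:.0e}", required_ln_sigma(rate, length))
            for name, length in lengths.items() for rate in (low, high)]


for label, needed in msi_cases():
    satisfied = math.log(SIGMA_ASSUMED) > needed
    print(f"{label}: need ln(sigma) > {needed:.3g}; satisfied: {satisfied}")
```

Under these assumptions the inequality fails in every case, and at the upper end of the MSI error range the required $\ln \sigma_m$ is in the thousands for the whole genome.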

We believe that these inconsistencies of the basic quasispecies model, when applied to human cells,
completely disappear if we replace single nucleotide errors with aneuploidy errors. Notably, in all scenarios
where quasispecies theory could potentially apply, i.e. stem and progenitor cell proliferation and cancer
cells, aneuploidy rates far exceed single nucleotide error rates in frequency and impact on the cell, so
that, effectively, the leading cause in the evolution of a population will be the aneuploidy error rather
than the single nucleotide error. The latter can be neglected, especially when the mismatch repair genes are
intact, as happens in the overwhelming majority of cancers and in all healthy stem cells.

We now need to reinterpret the notion of fidelity of reproduction of a sequence in order to properly
define error thresholds in the presence of aneuploidy. Since we can neglect nucleotide errors, we assume
a faithful reproduction of the genetic material when the copy number of each chromosome in a sequence
(two, for example, in a diploid cell) is kept constant during replication, both numerically (number of
physically distinct whole chromosomes or fragments) and structurally (translocations, deletions and
amplifications of DNA). Although complex aneuploidy landscapes may arise, characterized by concurrent
numerical and structural chromosomal changes, most of the somatic copy-number alterations (SCNAs)
frequently found in tumor cells involve whole chromosomes or whole arms (25% of the genome), with
only 10% of the cancer cell genome being affected by focal SCNAs [4, 104, 105].

Therefore, we now define the chromosomal master sequence length $c_m$ of a cell as the total number of all
whole and fragmented chromosomes in its nucleus, and we define the aneuploid fidelity $A_m$ as the average
probability that each whole or fragmented chromosome is reproduced exactly once in cell division, with
no gain or loss of sub-chromosomal regions. In this aneuploid scenario, the chromosomal master sequence
length $c_m$ can fluctuate depending on the number of aneuploid copies of whole chromosomes or fragments,
and the underlying nucleotide sequence will clearly differ according to which chromosomes or individual
genes are affected by copy-number alterations in each cell. Although tumors vary widely in the number
and type of copy number changes, most of these comprise low-level alterations, and only a few genes reach
copy numbers above 20, mainly due to their oncogenic or drug-resistance functions [80, 106-108].

Aneuploid events can cause large phenotypical variations [55-57, 109]: even a single error leading to
chromosomal loss or addition can have large effects. The fitness distribution around a master
sequence is therefore expected to display a sharp decay from the master sequence peak, in line with the types of
fitness distribution known to express the error threshold [SS]. At least for cancer cells, sub-populations
are sharply defined in terms of their aneuploid profile, as evinced by the single cell analysis works [80]
discussed in Section 3; this is further evidence of the strong concentration of fitness distributions around
a few chromosomal sequence types.

Mutants of the master sequence, generated by even a single aneuploid error, and individual cells 
belonging to other sub-populations, are exceedingly unlikely to be able to mutate into cells expressing 
the master sequence, since any additional (erroneous) chromosome copy is subject to a wide variety 
of further partial deletions/additions, and only very few of them would correspond to a return to the 
master sequence configuration. Essentially, we can assume that the contribution of cells belonging to 
other subpopulation types to the dynamical evolution of the master sequence subpopulation is very 
small, and this is exactly the condition that led to the error threshold in the first place, since Eq. 3 is 
derived as a limiting stationary behavior of an interacting family of subpopulations described by Eq. 1, 
where the rate of growth of each of them is weakly affected by the cross-mutations derived from the other 
subpopulations [20, 95].

Given these caveats regarding distribution of fitness for chromosomal master sequences in the presence 
of aneuploidy and regarding sub-population interactions, we reach the conclusion that quasispecies theory 
is indeed applicable to cancer and stem cells, but only in the context of aneuploid chromosomal master 
sequences, neglecting the underlying nucleotide errors. 

We can now replace variables in the error threshold inequality Eq. 3 to take into account not only
the variable length $c_m$ of the chromosomal master sequence associated to all whole and fragmented
chromosomes, but also the correction to the probability $Q_{mm}$ of precise reproduction of a sequence that
aneuploidy entails. The probability of precise reproduction of a specific sequence of chromosomes $m$ can
be expressed as $Q_{mm} = A_m^{c_m}$, and the condition in Eq. 2 becomes $\sigma_m A_m^{c_m} > 1$,
which gives us an aneuploid error threshold inequality formally identical to Eq. 3:

$$\frac{\ln \sigma_m}{c_m} > 1 - A_m, \qquad (4)$$

however, each term in this equation has a drastically different order of magnitude than in the threshold
inequality for DNA or RNA master sequences. To start with, as already stressed above, the relative
superiority $\sigma_m$ will have considerable fluctuations, since the chromosomal master sequence is much shorter
than a nucleotide sequence, and even small variations in copy numbers can effect large phenotypical
variations.

At the same time, the diversity of subpopulations in primary tumors [80, 83-86, 90] implies that, in
a fully developed tumor, different subpopulations do not have extremely different relative superiority
$\sigma_m$, since that scenario would lead to a single, highly dominant subpopulation. It is therefore reasonable to
assume at the very most $\sigma_m \in [10^2, 10^3]$ for the subpopulations of highest relative superiority. We know
moreover that highly aneuploid tumors have higher fitness [110], so larger values of $\sigma_m$ are likely to be
associated with large values of $c_m$, up to the order of $10^2$ [78, 82, 105, 111].

Write $E_m = 1 - A_m$ in Eq. 4, with $E_m$ the aneuploidy error rate, i.e. the average probability that
there is at least one new aneuploid defect for each chromosome or fragment of chromosome during cell
replication. If we call $T(E_m)$ the threshold aneuploidy error rate above which a chromosomal master
sequence is not viable, and if we take $\sigma_m \in [10^2, 10^3]$ and $c_m \sim 10^2$, then Eq. 4 gives $T(E_m) \approx 10^{-2}$, which
is consistent with the estimates of $E_m$ for cancer cells, in the range $[10^{-3}, 10^{-1}]$, derived in [2, 112-115].
Our argument implies that the more aneuploid a cell is, the tighter the error threshold bound is, and
that highly aneuploid cancer cells, known to be the most adaptable [29, 110, 116, 117], are already working
with aneuploid error rates close to the limit of a viable quasispecies.
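
The order-of-magnitude estimate above can be reproduced directly from Eq. 4. The following sketch evaluates $\ln \sigma_m / c_m$ for the quoted ranges; the function names are ours, and the value 46 (the diploid chromosome count) is included only as a hypothetical comparison point to show how the bound tightens as $c_m$ grows.

```python
import math


def aneuploid_error_threshold(sigma, c_m):
    """Bound from Eq. 4: largest tolerable aneuploidy error rate, ln(sigma) / c_m."""
    return math.log(sigma) / c_m


def within_threshold(error_rate, sigma, c_m):
    """True if a given aneuploidy error rate satisfies Eq. 4."""
    return error_rate < aneuploid_error_threshold(sigma, c_m)


for sigma in (1e2, 1e3):
    for c_m in (46, 100):  # 46 is used here only for comparison
        t = aneuploid_error_threshold(sigma, c_m)
        print(f"sigma={sigma:.0e}, c_m={c_m}: threshold = {t:.3f}")
```

For $c_m = 100$ the bound lies between about 0.046 and 0.069, i.e. of order $10^{-2}$, so the lower end of the reported error range $[10^{-3}, 10^{-1}]$ satisfies Eq. 4 while the upper end does not.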

There is some evidence that aneuploidy rates in the tumor can indeed affect the prognosis of cancer
patients: in [4] and [118] it is suggested that a moderate tumor aneuploidy rate worsens the prognosis,
while a very high aneuploidy rate is associated in [J with improved patient outcome, consistent with
the quasispecies and error threshold catastrophe approach.

A Diversification Factor 

In Section 3 we inferred from some experimental results of [80] and other published supporting information
JHHIH11.87!, that metastatic cancer cells have higher aneuploidy rates than the corresponding original 
subpopulation of the primary tumor and we concluded that this differential could only be explained by 
assuming an adaptive cellular response sensitive to changes in the environment. 

Moreover, as pointed out in Section 1, high aneuploidy rates are common in cancer cells and embryo 
stem cells, and, to a much lower degree in some types of adult stem cells, showing again a fine-tuned 
differential in the degree of diversification generated by aneuploidy rates. Taken together, these observa- 
tions imply that adaptive aneuploidy is regulated through a specific cellular mechanism, a diversification 
factor, already active in embryo and adult stem cells to variable degrees, and then excessively reactivated 
in cancer cells. 

The duration and degree of activation of a diversification factor, dependent on the phenotype of
cultured cells and/or on culture conditions and manipulation, might underlie the conflicting results in the
measurement of aneuploidy in adult stem cell cultures (Section 1). Since failed cytokinesis or multipolar
mitosis (resulting in an increase or a reduction of chromosomes, respectively) cause numerical chromosome
abnormalities, but also uniparental chromosome sets, it could be that controlled expression of protein(s)
involved in determining or controlling the segregation of chromosomes would result in a certain degree of
aneuploidy in a population of cells. Biallelic mutation of the BUB1B gene results in constitutional
aneuploidy [33], and near-diploid aneuploidization has been experimentally induced by overexpression of the
BUB1 protein, both genes coding for key proteins involved in the anaphase checkpoint machinery [119].
Similarly, transient induction of the mitotic checkpoint gene Mad2 leads to chromosomal instability and
to tumor relapse [120]. Another component of the mammalian mitotic checkpoint, the centromere-associated
protein E (CENP-E), is essential to prevent aneuploidy [121]. Finally, in a recent paper, three
new genes located on chromosome 18q have been identified that contribute to suppressing chromosomal
instability and are subject to frequent copy number loss in colorectal cancers characterized by chromosomal
instability [122]. Therefore adaptive aneuploidy could be driven by timed and controlled gene expression
of a specific protein or a set of proteins regulating mitotic spindle assembly, centrosome duplication
and cell-cycle checkpoints. As a matter of fact, inactivating mutations in the STAG2 gene (encoding
a component of the cohesin complex required for normal chromosome segregation) are known to induce
aneuploidy [55].

Interestingly, DNA hypomethylation has been associated with aneuploidy and cancer, and
post-transcriptional silencing of the DNA methyltransferase DNMT1 has recently been shown to induce
aneuploidy in fibroblasts and colon cancer cells [123]. A recent study focused on the high-frequency
mutations occurring at non-methylated cytosines throughout the genome of many breast cancers and found
that overexpression of the DNA cytosine deaminase APOBEC3B correlates with mutation rate [124].

The mutator phenotype is a common feature of cancer cells [125, 126] and, although it promotes the
emergence of driver mutations, many deleterious passenger mutations accumulate during tumor
evolution [127]. Buffering mechanisms such as the proteasome or the chaperone systems reduce the detrimental
effects of passenger mutations and copy number alterations [128]. In this regard it is worth citing the
potentially relevant role of HSP90 in adaptive aneuploidy. HSP90 is a molecular chaperone, conserved from
plants to mammals, that assists metastable client proteins and helps them maintain their correct
conformation or refold after mutational events or stress-induced denaturation [129]. Due to its pleiotropic
and fundamental function, HSP90 has been regarded as an evolutionary "capacitor", i.e. it buffers
genetic variation (in terms of mutations or copy number), resulting in phenotypic robustness [130]. Many
developmental and metabolic phenotypes are threshold traits where the aberrant phenotype is triggered
when the disease-associated factor falls below a crucial level, and this level may differ between species.
The cryptic variants that under normal conditions are buffered by HSP90 would eventually be exposed
and tested by natural selection in a stressful environment where HSP90 function is compromised [131].
In such cases, HSP90 would act on the pre-stored genetic diversity, but it has recently also been correlated with
the emergence of aneuploidy and consequently increased adaptability under stress conditions in a yeast
model [6].

Due to the arrangement of eukaryotic genes in functional neighborhoods inside chromosomes and 
due to the high level of gene co-expression and monoallelic expression, even small changes in the total 
DNA content of a cell, i.e. low-level aneuploidy, are likely to result in a big phenotypic leap and enable 
the cell to explore a wide region of phenotypic landscapes. Thus, by altering the dosage of regulatory 
factors, aneuploidy can cause broad gene expression changes well beyond a direct DNA dosage effect. 
In relatively small populations under a strong selective force, the number of mutations with sufficient
phenotypic effect to achieve adaptation is limited, and an increase in aneuploidy rate would certainly result in
favorable cellular selection. Certain somatic evolutionary processes, such as the clonal expansion in early
tumor/metastasis progression or relapse after drug treatment, may fall into this category. Therefore, the 
great ability of tumor metastases to resist therapeutic treatments is intimately connected to their higher 
copy number alterations as compared to primary tumors and this confirms our interpretation of the most 
recent literature. 

Finally, starvation and related intercellular signaling could be a possible direct or indirect trigger
of adaptive aneuploidy, because it is potentially at work in all organisms where variable aneuploidy is
known to play a positive, adaptive role, including plants, unicellular organisms and animals. Indeed,
both replication stress and nucleotide deficiency are associated with genomic instability [122, 132]. Elevated
cell replication rates in the presence of less than optimal developmental conditions for the embryo would
justify an increased microvariation of ploidy [13] to foster adaptation, and even in the adult organism
several organs may be amenable to continuing adaptation through variable aneuploidy rates [11, 12, 17].
Our point is that only by taking a broad view that puts aneuploidy rate in its adaptive and evolutionary
context, understanding its role across species and across stages of embryo development, can we hope to
identify the potential causes of adaptive aneuploidy in cancer.

A validation of the conjecture that cancer cells can adapt their aneuploidy rate through a diversification
factor would have significant therapeutic implications, as it would provide a biological way, mostly
inactive or less sensitive in healthy adult cells, to apply quasispecies error catastrophe strategies, along
the lines of Section 4, to weaken cancer populations, a long-held hope that may yet prove itself true.